
Group field caps shard requests per node #77047

Merged
merged 10 commits into elastic:group-field-caps from field-caps on Sep 27, 2021

Conversation


@jtibshirani jtibshirani commented Aug 31, 2021

Currently to gather field caps, the coordinator sends a separate transport
request per index. When the original request targets many indices, the overhead
of all these sub-requests can add up and hurt performance. This PR switches the
execution strategy to reduce the number of transport requests: it groups
together the index requests that target the same node, then sends only one
request to each node.

Addresses #74648.
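
As a rough illustration, the grouping step boils down to something like the sketch below. This is not the exact code in the PR; the helper name and the copy-selection are illustrative (the real logic picks shard copies differently, and fans out to all copies of an index when there is an index_filter), and it assumes the coordinator has already resolved the concrete indices.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.routing.ShardRouting;
import org.elasticsearch.index.shard.ShardId;

// Sketch: bucket the shard copies to query by the node that currently holds them,
// so the coordinator can send one node-level request per node instead of one
// transport request per index.
static Map<String, List<ShardId>> groupShardsByNode(ClusterState clusterState, String[] concreteIndices) {
    Map<String, List<ShardId>> shardsByNode = new HashMap<>();
    for (String index : concreteIndices) {
        for (ShardRouting shard : clusterState.routingTable().allShards(index)) {
            // Simplification for the sketch: query the primary copy of each shard.
            if (shard.primary() && shard.active()) {
                shardsByNode.computeIfAbsent(shard.currentNodeId(), n -> new ArrayList<>())
                    .add(shard.shardId());
            }
        }
    }
    return shardsByNode; // one entry, and later one request, per node id
}

In the worst case this produces one request per data node, regardless of how many indices the original request targets.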

@jtibshirani jtibshirani force-pushed the field-caps branch 3 times, most recently from a10af92 to 26427e9 on August 31, 2021 18:35
@jtibshirani
Contributor Author

This approach makes compromises to keep the implementation simple. I'm looking for feedback on these decisions:

  • Before, if a shard copy unexpectedly failed while retrieving field caps, we retried on the next copy. Now that index will just be marked as "failed" in the response. This seemed okay to me; I wasn't sure we really need this kind of "live retry" logic in field caps.
  • Previously when there was an index_filter, we tried each shard in the index one after the other, stopping when one of them matched. Now we fan out to all of the index shards preemptively, which means the nodes do redundant work.

Also -- I opened this PR against 7.x since the BWC logic can affect the design. If merged I'll forward-port to master.

* Loads the mappings for an index and computes all {@link IndexFieldCapabilities}. This
* helper class performs the core shard operation for the field capabilities action.
*/
class FieldCapabilitiesFetcher {
Contributor Author

This class lets us use the same per-shard logic for both the new and old execution strategies. It's not completely necessary to add it -- I could have shuffled some inner classes around to let us share this logic. However I found this to be a nice abstraction. It helps break up TransportFieldCapabilitiesAction, which is complex, and opens the door to adding unit tests for field caps (which I hope to do in a follow-up).

List<ShardId> shardIds = entry.getValue();

DiscoveryNode node = clusterState.getNodes().get(nodeId);
assert node != null;
Contributor Author

@jtibshirani jtibshirani Aug 31, 2021

Can node be null? I didn't think so, but TransportFieldCapabilitiesIndexAction has some logic to handle this case:

In general the logic around failures is pretty tricky. I'd like to write more tests for failure cases once I know the overall approach is okay.

Contributor

I don't think it's a valid assertion. If the node is null, we should fail the request the same way, since the node may vanish at any time. In the current model that's ok because we'd try another replica, but the new model requires trying these shards on potentially more than one node. So unless we decide that we don't want to retry on replicas, we'll need to handle failures differently.

Contributor Author

We had a conversation offline where we decided the current strategy (where we don't retry failures) was not acceptable. For example if a shard is being relocated, the coordinator could have an out-of-date view of where it is, and the field caps request would fail. It's challenging for clients to handle these partial failures.

Maybe in an immediate follow-up, we can add support for a retry step, where we do another round of node requests to retry on other shards/replicas. Like the follow-up around reducing responses, this would need to ship in the same release. For now in this PR I'd just replace the incorrect assertion with an error.
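
Concretely, in the loop over per-node groups it could look like the sketch below (the exception choice and the onIndexFailure helper are placeholders, not code from this PR):

DiscoveryNode node = clusterState.getNodes().get(nodeId);
if (node == null) {
    // The node may have left the cluster since the routing table was read;
    // record a failure for the affected indices instead of asserting.
    for (ShardId shardId : shardIds) {
        onIndexFailure(shardId.getIndexName(),
            new IllegalStateException("node [" + nodeId + "] is no longer part of the cluster"));
    }
    continue;
}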

@jtibshirani jtibshirani marked this pull request as ready for review August 31, 2021 23:32
@jtibshirani jtibshirani added :Search/Search Search-related issues that do not fall into other categories >enhancement v8.0.0 labels Aug 31, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Aug 31, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@ywelsch ywelsch self-requested a review September 1, 2021 11:19
@dnhatn
Member

dnhatn commented Sep 8, 2021

Thanks @jtibshirani.

Previously when there was an index_filter, we tried each shard in the index one after the other, stopping when one of them matched. Now we fan out to all of the index shards preemptively, which means the nodes do redundant work.

I am not sure it's a good approach because it will perform heavy work such as RBAC authorization and iterating over fields multiple times. Since we retrieve field caps from any matching shard of an index, I think we can instead pass a list of indices in the node-level request. The receiving node then exhaustively tries all shards of the requested indices (one by one) until it finds matching copies, and the coordinating node retries on another node for indices without matching shards. WDYT?
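
Roughly, the receiving side of that alternative would look like the sketch below (localShardsFor, matchesIndexFilter and fetchFieldCaps are placeholders for the existing per-shard logic, not methods from this PR):

for (String index : request.indices()) {
    FieldCapabilitiesIndexResponse indexResponse = null;
    for (ShardRouting shard : localShardsFor(index)) {
        try {
            if (matchesIndexFilter(shard, request.indexFilter())) {
                indexResponse = fetchFieldCaps(shard);
                break; // any one matching shard is enough for this index
            }
        } catch (Exception e) {
            // fall through and try the next local copy of this index
        }
    }
    if (indexResponse != null) {
        indexResponses.add(indexResponse);
    } else {
        unmatchedIndices.add(index); // the coordinator can retry these on another node
    }
}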

Contributor

@jimczi jimczi left a comment

The logic makes sense to me. I left some comments regarding the merging of responses and the bwc.

class FieldCapabilitiesNodeResponse extends ActionResponse implements Writeable {
private final String[] indices;
private final List<FieldCapabilitiesFailure> failures;
private final List<FieldCapabilitiesIndexResponse> indexResponses;
Contributor

We could do a pre-merge locally to reduce the size of the response? It seems wasteful to send back the entire list.

Contributor Author

I agree, but I think this is nice to do as a follow-up to keep this PR simpler. Since it will affect the wire format, we should make the follow-up in this same release. I'll figure out how to do that (maybe we could have a short-lived feature branch).


// If all nodes are on version 7.16 or higher, then we group the shard requests and send a single request per node.
// Otherwise, for backwards compatibility we follow the old strategy of sending a separate request per shard.
Contributor

We have to take the remote cluster into account. Would it be simpler to make the decision on a per-connection level? I think it's "ok" to emulate the old model by translating the node request into multiple shard requests on older connections.
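
For example, roughly (sketch only; sendNodeRequest and sendLegacyShardRequest are placeholders for whatever senders the PR ends up with):

Transport.Connection connection = transportService.getConnection(node);
if (connection.getVersion().onOrAfter(Version.V_7_16_0)) {
    // new path: one node-level request carrying all shard ids for this node
    sendNodeRequest(connection, shardIds);
} else {
    // old path: emulate the previous model with one request per shard
    for (ShardId shardId : shardIds) {
        sendLegacyShardRequest(connection, shardId);
    }
}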

Contributor Author

I'm having trouble understanding this part. Could you explain why we need to take the remote cluster into account?

@Override
public void messageReceived(final FieldCapabilitiesNodeRequest request,
final TransportChannel channel,
Task task) throws Exception {
Contributor

Should we fork the execution outside of the network thread? The list of shards can be large on some tiers.

Contributor Author

This is a good question! What do you think would be the right thread pool (maybe 'management')?
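
Something like this, if we go with 'management' (sketch; handleNodeRequest stands in for the current messageReceived body, and proper failure handling is omitted):

@Override
public void messageReceived(FieldCapabilitiesNodeRequest request, TransportChannel channel, Task task) {
    // fork off the transport thread so that iterating a large list of shards
    // does not block networking
    threadPool.executor(ThreadPool.Names.MANAGEMENT)
        .execute(() -> handleNodeRequest(request, channel, task));
}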


@jtibshirani jtibshirani changed the base branch from 7.x to group-field-caps September 27, 2021 18:45
@jtibshirani
Contributor Author

I discussed with @dnhatn and @jimczi and decided to merge the PR into a feature branch group-field-caps so we can follow up with more PRs. Some follow-ups we're tracking:

  • Handle shard failures through a retry mechanism
  • Reduce each node response before sending it back to coordinator
  • Move node-level logic to a separate threadpool (?)

We also discussed @dnhatn's concern about the strategy for index_filter. We decided the best trade-off is to stick with the current strategy: it keeps the logic simple, and we're not too concerned about the extra work, since it is bounded by the number of nodes.

Member

@dnhatn dnhatn left a comment

Thanks Julie! Let's merge this work to the feature branch.

@jtibshirani jtibshirani merged commit 598ce64 into elastic:group-field-caps Sep 27, 2021
@jtibshirani jtibshirani deleted the field-caps branch September 27, 2021 19:45
dnhatn added a commit that referenced this pull request Oct 14, 2021
This adds a retry mechanism for node level field caps requests 
introduced in #77047.
dnhatn pushed a commit to dnhatn/elasticsearch that referenced this pull request Oct 15, 2021
Currently to gather field caps, the coordinator sends a separate transport
request per index. When the original request targets many indices, the overhead
of all these sub-requests can add up and hurt performance. This PR switches the
execution strategy to reduce the number of transport requests: it groups
together the index requests that target the same node, then sends only one
request to each node.
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Oct 15, 2021
This adds a retry mechanism for node level field caps requests 
introduced in elastic#77047.
dnhatn added a commit that referenced this pull request Oct 15, 2021
Currently to gather field caps, the coordinator sends a separate transport
request per index. When the original request targets many indices, the overhead
of all these sub-requests can add up and hurt performance. This PR switches the
execution strategy to reduce the number of transport requests: it groups
together the index requests that target the same node, then sends only one
request to each node.

Relates #77047
Relates #78647


Co-authored-by: Julie Tibshirani <[email protected]>
dnhatn added a commit that referenced this pull request Oct 15, 2021
Currently to gather field caps, the coordinator sends a separate transport
request per index. When the original request targets many indices, the overhead
of all these sub-requests can add up and hurt performance. This PR switches the
execution strategy to reduce the number of transport requests: it groups
together the index requests that target the same node, then sends only one
request to each node.

Relates #77047
Relates #78647

Co-authored-by: Julie Tibshirani <[email protected]>